Merge job data into individual level datasets - numerical values
The script below shows how to work with job data from the A scheme. The example was covered in our thematic course, which ran twice in 2022.
This example shows how to link data that has employment relationship as its unit type to another dataset that has person as its unit type. This is useful when you want to produce job statistics at the individual level.
Data from the so-called A scheme with the prefix ARBLONN_ARB_ has employment relationship as its unit type, so a person can have several records. To link such data to ordinary person-level datasets, you must first use the collapse command to sum the information from each employment relationship per individual. The dataset is then transformed to person as the unit type (one record per person) and can be merged into the person dataset.
The variable ARBEIDSFORHOLD_PERSON is used to link and aggregate from job level to person level. It contains the person identifier associated with the relevant employment relationship.
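Because individual records cannot be inspected in the analysis environment, it can help to see the collapse and merge steps on made-up data. The following is a minimal sketch in plain Python, not microdata.no script. The records, the helper names `collapse_sum` and `merge_into`, and the choice to give persons without any job record a missing value (`None`) are all assumptions made for illustration.

```python
from collections import defaultdict

# Made-up job records: one row per employment relationship.
# "personid" plays the role of ARBEIDSFORHOLD_PERSON.
jobs = [
    {"personid": "A", "worktime": 20.0, "position_percentage": 50.0},
    {"personid": "A", "worktime": 10.0, "position_percentage": 25.0},
    {"personid": "B", "worktime": 37.5, "position_percentage": 100.0},
]

# Made-up person records: one row per person.
employed = {
    "A": {"gender": 1, "age": 34},
    "B": {"gender": 2, "age": 52},
    "C": {"gender": 2, "age": 23},  # has no row in `jobs`
}


def collapse_sum(rows, key, columns):
    """Sum `columns` over all rows that share the same `key` value."""
    totals = defaultdict(lambda: dict.fromkeys(columns, 0.0))
    for row in rows:
        for col in columns:
            totals[row[key]][col] += row[col]
    return dict(totals)


def merge_into(person_rows, totals, columns):
    """Attach the summed columns to each person (assumed: None if no match)."""
    merged = {}
    for pid, person in person_rows.items():
        extra = totals.get(pid, dict.fromkeys(columns, None))
        merged[pid] = {**person, **extra}
    return merged


columns = ["worktime", "position_percentage"]
per_person = collapse_sum(jobs, "personid", columns)   # one row per person
result = merge_into(employed, per_person, columns)     # person-level dataset

for pid, row in result.items():
    print(pid, row)
```

In this toy data, person A's two jobs are summed into a single row (30.0 hours, 75.0 percent) before being attached to the person record. Note that summing position percentages across several jobs can give totals above 100.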
The procedure below is suitable for numerical job information such as working hours and position percentage. If you want to link categorical job information, the solution in this example is recommended: [Merge job data into individual level datasets - categorical values](i18n\en\docusaurus-plugin-content-docs\current\eksempel\Tema - Jobb- og inntektsanalyser\Koble sammen data om arbeidsforhold på persondatasett - kategoriske opplysninger.md).
require no.ssb.fdb:30 as db
// Create dataset for employed (person data)
create-dataset employed
import db/ARBLONN_PERS_KJOENN 2021-07-16 as gender
import db/ARBLONN_PERS_ALDER 2021-07-16 as age
// Create dataset for employment relations (job data)
create-dataset employment_relations
import db/ARBLONN_ARB_ARBEIDSTID 2021-07-16 as worktime
import db/ARBLONN_ARB_STILLINGSPST 2021-07-16 as position_percentage
import db/ARBEIDSFORHOLD_PERSON as personid
// Aggregate from job level to person level by summing work time and position percentage per person. Then merge the job information into the person dataset employed
collapse (sum) worktime position_percentage, by(personid)
merge worktime position_percentage into employed
// Create job statistics
use employed
summarize worktime position_percentage
tabulate gender, summarize(worktime)
tabulate gender, summarize(position_percentage)
generate age_group = 1
replace age_group = 2 if age > 25
replace age_group = 3 if age > 40
replace age_group = 4 if age > 60
define-labels age_labels 1 '0-25' 2 '26-40' 3 '41-60' 4 '61->'
assign-labels age_group age_labels
tabulate age_group gender, summarize(worktime)
tabulate age_group gender, summarize(position_percentage)